Skip to content

Conversation

@tschuett
Copy link

@tschuett tschuett commented Nov 20, 2024

The carry-out can participate in several positions:

  • the condition of a select
  • the condition of a conditional branch
  • the carry-in of addes and subes (narrowScalarAddSub)

ADCS: Add with carry and set flags

Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result.

Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos.

Including, but not limited to the lowering for G_USUBO is buggy:
MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);

The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in.

narrowScalarAddSub can build chains of addes/subes. When the ops are not legal for the current target and get lowered. They become chains of adds/subs and icmps. The result will become target dependent, which is incorrect.

Lowering should write selects between one and zero constants into carry-out registers.

AARrch64: ZeroOrOneBooleanContent
AMDGPU: ZeroOrNegativeOneBooleanContent
and ZeroOrOneBooleanContent
RISCV: ZeroOrOneBooleanContent

The carry-out can participate in several positions:
* the condition of a select
* the condition of a conditional branch
* the carry-in of addes and subes (narrowScalarAddSub)

ADCS: Add with carry and set flags

Add with Carry, setting flags, adds two register values and the Carry
flag value, and writes the result to the destination register. It
updates the condition flags based on the result.

Note that the carry-in participates in the addition. It must be a 0 or
1 value. The carry-in/out values must be 0 or 1 for the addos, subos,
addes, and subos.

Including, but limited to the lowering for G_USUBO is buggy:
`MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);`

The carry-out is target dependent and can lead to surprising and
non-deterministic results when used as carry-in.

narrowScalarAddSub can build chains of addes/subes. When the ops are
not negal for the current target and get lowered. They become chains
of adds/subs and icmps. The result will be come target dependent,
which is incorrect.

Lowering should write selects between one and zero constants into
carry-out registers.

AARrch64: ZeroOrOneBooleanContent
AMDGPU: ZeroOrNegativeOneBooleanContent
 and ZeroOrOneBooleanContent
RISCV: ZeroOrOneBooleanContent
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

The carry-out can participate in several positions:

  • the condition of a select
  • the condition of a conditional branch
  • the carry-in of addes and subes (narrowScalarAddSub)

ADCS: Add with carry and set flags

Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result.

Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos.

Including, but limited to the lowering for G_USUBO is buggy: MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);

The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in.

narrowScalarAddSub can build chains of addes/subes. When the ops are not negal for the current target and get lowered. They become chains of adds/subs and icmps. The result will be come target dependent, which is incorrect.

Lowering should write selects between one and zero constants into carry-out registers.

AARrch64: ZeroOrOneBooleanContent
AMDGPU: ZeroOrNegativeOneBooleanContent
and ZeroOrOneBooleanContent
RISCV: ZeroOrOneBooleanContent


Full diff: https://github.com/llvm/llvm-project/pull/116998.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+2-6)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir (+23)
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index d95fc8cfbcf558..027e4ee7159b41 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -7830,9 +7830,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
     case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
       MatchInfo = [=](MachineIRBuilder &B) {
         B.buildSub(Dst, LHS, RHS);
-        B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
-                                              /*isVector=*/CarryTy.isVector(),
-                                              /*isFP=*/false));
+        B.buildConstant(Carry, 1);
       };
       return true;
     }
@@ -7855,9 +7853,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
   case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
     MatchInfo = [=](MachineIRBuilder &B) {
       B.buildSub(Dst, LHS, RHS);
-      B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
-                                            /*isVector=*/CarryTy.isVector(),
-                                            /*isFP=*/false));
+      B.buildConstant(Carry, 1);
     };
     return true;
   }
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
index 20cba54923548e..8f24ee76f82b32 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
@@ -277,3 +277,26 @@ body:             |
     $q1 = COPY %o_wide
     RET_ReallyLR implicit $w0
 ...
+---
+name:            usub_may_carry_non_const
+body:             |
+  bb.0:
+    liveins: $w0, $w1
+    ; CHECK-LABEL: name: usub_may_carry_non_const
+    ; CHECK: liveins: $w0, $w1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w0
+    ; CHECK-NEXT: %sub:_(s32), %o:_(s1) = G_USUBO [[COPY]], [[COPY1]]
+    ; CHECK-NEXT: %o_wide:_(s32) = G_ZEXT %o(s1)
+    ; CHECK-NEXT: $w0 = COPY %sub(s32)
+    ; CHECK-NEXT: $w1 = COPY %o_wide(s32)
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:_(s32) = COPY $w0
+    %1:_(s32) = COPY $w0
+    %sub:_(s32), %o:_(s1) = G_USUBO %0, %1
+    %o_wide:_(s32) = G_ZEXT %o(s1)
+    $w0 = COPY %sub(s32)
+    $w1 = COPY %o_wide
+    RET_ReallyLR implicit $w0
+...

@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Thorsten Schütt (tschuett)

Changes

The carry-out can participate in several positions:

  • the condition of a select
  • the condition of a conditional branch
  • the carry-in of addes and subes (narrowScalarAddSub)

ADCS: Add with carry and set flags

Add with Carry, setting flags, adds two register values and the Carry flag value, and writes the result to the destination register. It updates the condition flags based on the result.

Note that the carry-in participates in the addition. It must be a 0 or 1 value. The carry-in/out values must be 0 or 1 for the addos, subos, addes, and subos.

Including, but limited to the lowering for G_USUBO is buggy: MIRBuilder.buildICmp(CmpInst::ICMP_ULT, BorrowOut, LHS, RHS);

The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in.

narrowScalarAddSub can build chains of addes/subes. When the ops are not negal for the current target and get lowered. They become chains of adds/subs and icmps. The result will be come target dependent, which is incorrect.

Lowering should write selects between one and zero constants into carry-out registers.

AARrch64: ZeroOrOneBooleanContent
AMDGPU: ZeroOrNegativeOneBooleanContent
and ZeroOrOneBooleanContent
RISCV: ZeroOrOneBooleanContent


Full diff: https://github.com/llvm/llvm-project/pull/116998.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+2-6)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir (+23)
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index d95fc8cfbcf558..027e4ee7159b41 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -7830,9 +7830,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
     case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
       MatchInfo = [=](MachineIRBuilder &B) {
         B.buildSub(Dst, LHS, RHS);
-        B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
-                                              /*isVector=*/CarryTy.isVector(),
-                                              /*isFP=*/false));
+        B.buildConstant(Carry, 1);
       };
       return true;
     }
@@ -7855,9 +7853,7 @@ bool CombinerHelper::matchSuboCarryOut(const MachineInstr &MI,
   case ConstantRange::OverflowResult::AlwaysOverflowsHigh: {
     MatchInfo = [=](MachineIRBuilder &B) {
       B.buildSub(Dst, LHS, RHS);
-      B.buildConstant(Carry, getICmpTrueVal(getTargetLowering(),
-                                            /*isVector=*/CarryTy.isVector(),
-                                            /*isFP=*/false));
+      B.buildConstant(Carry, 1);
     };
     return true;
   }
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
index 20cba54923548e..8f24ee76f82b32 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-overflow.mir
@@ -277,3 +277,26 @@ body:             |
     $q1 = COPY %o_wide
     RET_ReallyLR implicit $w0
 ...
+---
+name:            usub_may_carry_non_const
+body:             |
+  bb.0:
+    liveins: $w0, $w1
+    ; CHECK-LABEL: name: usub_may_carry_non_const
+    ; CHECK: liveins: $w0, $w1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w0
+    ; CHECK-NEXT: %sub:_(s32), %o:_(s1) = G_USUBO [[COPY]], [[COPY1]]
+    ; CHECK-NEXT: %o_wide:_(s32) = G_ZEXT %o(s1)
+    ; CHECK-NEXT: $w0 = COPY %sub(s32)
+    ; CHECK-NEXT: $w1 = COPY %o_wide(s32)
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %0:_(s32) = COPY $w0
+    %1:_(s32) = COPY $w0
+    %sub:_(s32), %o:_(s1) = G_USUBO %0, %1
+    %o_wide:_(s32) = G_ZEXT %o(s1)
+    $w0 = COPY %sub(s32)
+    $w1 = COPY %o_wide
+    RET_ReallyLR implicit $w0
+...

@arsenm
Copy link
Contributor

arsenm commented Nov 21, 2024

The carry-out is target dependent and can lead to surprising and non-deterministic results when used as carry-in.

Can you elaborate? The carry out should be a boolean value that conforms to the target boolean content. It is not a free form value that you can just add to something.

Note that the carry-in participates in the addition. It must be a 0 or 1 value.

No? It's a boolean. Its value shouldn't be directly interpreted as an addend?

@tschuett
Copy link
Author

// Add the sum and the carry.

The lowering of G_UADDE blindly adds the CarryIn to the sum of

auto TmpRes = MIRBuilder.buildAdd(Ty, LHS, RHS);

@tschuett
Copy link
Author

The adde combine also writes a 1 into the carry-out:

@tschuett
Copy link
Author

Our documentation describes the carry-in and -out as s1. It cannot be a target dependent infinity:
https://llvm.org/docs/GlobalISel/GenericOpcode.html#g-uadde-g-sadde-g-usube-g-ssube

@aemerson
Copy link
Contributor

Our documentation describes the carry-in and -out as s1. It cannot be a target dependent infinity: https://llvm.org/docs/GlobalISel/GenericOpcode.html#g-uadde-g-sadde-g-usube-g-ssube

I think that just mirrors what the langref says for overflow intrinsics, which basically say that output is a bit indicating a carry. The carry bit is IMO a boolean which we need to legalize into whatever representation the target needs for the boolean. Am I misunderstanding something here?

@tschuett
Copy link
Author

Thanks!

and the second element of which is a bit specifying ****if**** the signed summation resulted in an overflow.

We end up with a target specific boolean, but there are some inconsistencies in the code base.

@arsenm
Copy link
Contributor

arsenm commented Nov 22, 2024

The lowering of G_UADDE blindly adds the CarryIn to the sum of

This is a bug. We would need an optimization combine to fold the carry-out to add-in iff the target boolean content is ZeroOrOne

@arsenm
Copy link
Contributor

arsenm commented Nov 22, 2024

https://llvm.org/docs/GlobalISel/GenericOpcode.html#g-uadde-g-sadde-g-usube-g-ssube

This could use elaboration about the target boolean contents

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants